103 research outputs found

    Developing a manually annotated clinical document corpus to identify phenotypic information for inflammatory bowel disease

    Background: Natural Language Processing (NLP) systems can be used for specific Information Extraction (IE) tasks such as extracting phenotypic data from the electronic medical record (EMR). These data are useful for translational research and are often found only in free-text clinical notes. A key required step for IE is the manual annotation of clinical corpora and the creation of a reference standard, both for training and validation tasks and for focusing and clarifying NLP system requirements. These tasks are time-consuming, expensive, and require considerable effort on the part of human reviewers.
    Methods: Using a set of clinical documents from the VA EMR for a particular use case of interest, we identify specific challenges and present several opportunities for annotation tasks. We demonstrate specific methods using an open source annotation tool, a customized annotation schema, and a corpus of clinical documents for patients known to have a diagnosis of Inflammatory Bowel Disease (IBD). We report clinician annotator agreement at the document, concept, and concept attribute level. We estimate concept yield in terms of annotated concepts within specific note sections and document types.
    Results: Annotator agreement at the document level for documents that contained concepts of interest for IBD, using the estimated Kappa statistic (95% CI), was very high at 0.87 (0.82, 0.93). At the concept level, F-measure ranged from 0.61 to 0.83. However, agreement varied greatly at the specific concept attribute level. For this particular use case (IBD), clinical documents producing the highest concept yield per document included GI clinic notes and primary care notes. Within the various types of notes, the highest concept yield was in sections representing patient assessment and history of presenting illness. Ancillary service documents and family history and plan note sections produced the lowest concept yield.
    Conclusion: Challenges include defining and building appropriate annotation schemas, adequately training clinician annotators, and determining the appropriate level of information to be annotated. Opportunities include narrowing the focus of information extraction to use-case-specific note types and sections, especially in cases where NLP systems will be used to extract information from large repositories of electronic clinical note documents.
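
The agreement figures reported in this abstract (a Kappa statistic at the document level and an F-measure at the concept level) can be illustrated with a minimal sketch. The abstract does not say how either measure was computed, so the code below assumes the standard Cohen's kappa for two annotators and an F1 over exactly matching annotation spans; the function names and the span representation are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    Assumption: the standard two-rater formula; the paper's own
    estimation procedure is not given in the abstract.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1.0 - expected)


def span_f_measure(spans_a, spans_b):
    """F1 between two annotators' span sets, with A as the reference.

    Assumption: spans count as matching only when they are identical.
    """
    a, b = set(spans_a), set(spans_b)
    if not a and not b:
        return 1.0
    true_positives = len(a & b)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(b)
    recall = true_positives / len(a)
    return 2 * precision * recall / (precision + recall)
```

Here `cohen_kappa` would take one label per document (for example, whether the document contains a concept of interest), while `span_f_measure` would take sets of `(start, end)` character offsets for annotated concepts.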

    Automatic de-identification of textual documents in the electronic health record: a review of recent research

    Background: In the United States, the Health Insurance Portability and Accountability Act (HIPAA) protects the confidentiality of patient data and requires the informed consent of the patient and approval of the Internal Review Board to use data for research purposes, but these requirements can be waived if data is de-identified. For clinical data to be considered de-identified, the HIPAA "Safe Harbor" technique requires 18 data elements (called PHI: Protected Health Information) to be removed. The de-identification of narrative text documents is often realized manually, and requires significant resources. Well aware of these issues, several authors have investigated automated de-identification of narrative text documents from the electronic health record, and a review of recent research in this domain is presented here.
    Methods: This review focuses on recently published research (after 1995), and includes relevant publications from bibliographic queries in PubMed, conference proceedings, the ACM Digital Library, and interesting publications referenced in already included papers.
    Results: The literature search returned more than 200 publications. The majority focused only on structured data de-identification instead of narrative text, on image de-identification, or described manual de-identification, and were therefore excluded. Finally, 18 publications describing automated text de-identification were selected for detailed analysis of the architecture and methods used, the types of PHI detected and removed, the external resources used, and the types of clinical documents targeted. All text de-identification systems aimed to identify and remove person names, and many included other types of PHI. Most systems used only one or two specific clinical document types, and were mostly based on two different groups of methodologies: pattern matching and machine learning. Many systems combined both approaches for different types of PHI, but the majority relied only on pattern matching, rules, and dictionaries.
    Conclusions: In general, methods based on dictionaries performed better with PHI that is rarely mentioned in clinical text, but are more difficult to generalize. Methods based on machine learning tend to perform better, especially with PHI that is not mentioned in the dictionaries used. Finally, the issues of anonymization, sufficient performance, and "over-scrubbing" are discussed in this publication.
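
The pattern-matching and dictionary approach mentioned in this abstract can be sketched roughly as follows. This is an illustration only, not any reviewed system: the two regular expressions, the placeholder tags, and the `scrub` function are hypothetical, and they cover just three kinds of PHI (names, phone numbers, dates) out of the 18 Safe Harbor elements.

```python
import re

# Hypothetical patterns for two PHI types; real systems use far larger rule sets.
PHONE_RE = re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b")
DATE_RE = re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b")


def scrub(text, known_names=()):
    """Replace phone numbers, dates, and dictionary-listed names with tags."""
    # Rule-based step: pattern matching for structured PHI.
    text = PHONE_RE.sub("[PHONE]", text)
    text = DATE_RE.sub("[DATE]", text)
    # Dictionary step: only names present in the list are caught,
    # which is the generalization limit the abstract points out.
    if known_names:
        longest_first = sorted(known_names, key=len, reverse=True)
        alternation = "|".join(re.escape(name) for name in longest_first)
        name_re = re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)
        text = name_re.sub("[NAME]", text)
    return text
```

A name that is missing from `known_names` passes through unchanged, which is the kind of gap that the machine-learning methods in the review are described as handling better.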

    Endocrine therapy resistant ESR1 variants revealed by genomic characterization of breast cancer derived xenografts

    To characterize patient-derived xenografts (PDXs) for functional studies, we made whole-genome comparisons with originating breast cancers representative of the major intrinsic subtypes. Structural and copy number aberrations were found to be retained with high fidelity. However, at the single-nucleotide level, variable numbers of PDX-specific somatic events were documented, although they were only rarely functionally significant. Variant allele frequencies were often preserved in the PDXs, demonstrating that clonal representation can be transplantable. Estrogen-receptor-positive PDXs were associated with ESR1 ligand-binding-domain mutations, gene amplification, or an ESR1/YAP1 translocation. These events produced different endocrine-therapy-response phenotypes in human, cell line, and PDX endocrine-response studies. Hence, deeply sequenced PDX models are an important resource for the search for genome-forward treatment options and capture endocrine-drug-resistance etiologies that are not observed in standard cell lines. The originating tumor genome provides a benchmark for assessing genetic drift and clonal representation after transplantation.

    Observed Changes of Rain-Season Precipitation in China from 1960 to 2018

    Precipitation during the main rain season is important for natural ecosystems and human activities. In this study, using daily precipitation data from 515 weather stations in China, we analyzed the spatiotemporal variation of rain-season (May–September) precipitation in China from 1960 to 2018. The results showed that rain-season precipitation decreased over China from 1960 to 2018. Rain-season heavy (25 ≤ p < 50 mm/day) and very heavy (p ≥ 50 mm/day) precipitation showed increasing trends, while rain-season moderate (10 ≤ p < 25 mm/day) and light (0.1 ≤ p < 10 mm/day) precipitation showed decreasing trends from 1960 to 2018. The temporal changes indicated that rain-season light and moderate precipitation displayed downward trends in China from 1980 to 2010, while rain-season heavy and very heavy precipitation fluctuated from 1960 to 2018. Changes of rain-season precipitation showed clear regional differences. Northwest China and the Tibetan Plateau showed the largest positive trends of precipitation amount and days. In contrast, negative trends were found for almost all precipitation grades in the North China Plain, Northeast China, and North Central China. Changes toward drier conditions in these regions probably had a severe impact on agricultural production. In East China, Southeast China, and Southwest China, heavy and very heavy precipitation increased while light and moderate precipitation decreased. This result implies an increasing risk of floods and mudslides in these regions. A better understanding of precipitation change in China will contribute to more accurate prediction of regional climate change against the background of global climate change.
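
The four precipitation grades defined in this abstract can be written out as a small classifier. The thresholds are taken from the abstract; the function names, the handling of days below 0.1 mm as ungraded, and the per-grade day count are assumptions made for illustration, not the study's actual processing.

```python
from collections import Counter

# Lower bounds in mm/day, checked from the highest grade down.
GRADE_LOWER_BOUNDS = [
    (50.0, "very heavy"),  # p >= 50
    (25.0, "heavy"),       # 25 <= p < 50
    (10.0, "moderate"),    # 10 <= p < 25
    (0.1, "light"),        # 0.1 <= p < 10
]


def precipitation_grade(p_mm_per_day):
    """Return the grade name for a daily total, or None below 0.1 mm/day."""
    for lower_bound, name in GRADE_LOWER_BOUNDS:
        if p_mm_per_day >= lower_bound:
            return name
    return None  # assumption: such days are treated as having no precipitation


def grade_day_counts(daily_totals_mm):
    """Count the number of days falling in each grade."""
    grades = (precipitation_grade(p) for p in daily_totals_mm)
    return Counter(g for g in grades if g is not None)
```

Applied to one station's May–September daily totals for each year, `grade_day_counts` would give the per-grade precipitation days whose trends the abstract describes.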

    An Analysis of the Applicability of Conzenian School in China: Exemplified by Shangqiu

    Urban morphology has been studied extensively in Western countries, whereas related research started late in China, and studies of how Chinese urban form evolves are rare. The few case studies, such as Pingyao, show that urban form and its evolutionary process in China differ from those in the West. This paper reviews the existing research and takes Shangqiu as a case city to study the evolution of urban morphology in Chinese cities. It gives a preliminary analysis of China's special social and economic characteristics and of its urban morphological process, based on Shangqiu's double fringe belts and its plot and block patterns. By examining the evolution of Shangqiu's urban form, the paper aims to make a preliminary delimitation of Shangqiu's morphological periods, explore the mechanism of its evolution, summarize the evolutionary characteristics of Chinese cities, and reflect on the urban morphological approaches of the Western academic system.
    Shen, Z.; Feng, X.; Cheng, S.; Shi, Y. (2018). An Analysis of the Applicability of Conzenian School in China: Exemplified by Shangqiu. In 24th ISUF International Conference. Book of Papers. Editorial Universitat Politècnica de València. 251-263. https://doi.org/10.4995/ISUF2017.2017.5683